Introduction

My research question for this project is, "How are the locations of COVID-19 testing centers in Chicago related to COVID cases and demographic patterns?" I aimed to discover whether the distribution of COVID testing sites was related to COVID case rates or demographic trends (proxied by racial demographics).

I conducted this research using data from Chicago's Open Data portal. My data layers display COVID case rates by Chicago zip code, racial breakdowns by Census tract, and the locations of COVID testing sites. I loaded in sidewalk data that I hoped to use to create service areas of COVID testing sites, but I ultimately had to use buffers of the testing sites due to time and open source software constraints.

My method of statistical analysis was ANOVA (analysis of variance). This method tests whether the differences in means between groups are statistically significant.
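As a minimal sketch of what a one-way ANOVA computes, the function below derives the F statistic (between-group variance divided by within-group variance) by hand. The group values are hypothetical and are not taken from my data. In practice, a library routine such as `scipy.stats.f_oneway` returns both the F statistic and the p-value.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic for two or more groups of numbers."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical percent-white values for areas near and not near a testing site.
near = [62.0, 58.0, 71.0, 66.0]
not_near = [55.0, 60.0, 52.0, 57.0]
f_stat = one_way_anova_f(near, not_near)
print(f"F = {f_stat:.3f}")
```

A larger F statistic means the group means differ more relative to the spread within each group.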

Background Literature

Literature throughout the pandemic has shown disparities in the impacts of COVID-19, particularly across racial and ethnic lines. A recent study of six U.S. cities (including Chicago) showed that a 10 percentage point increase in a zip code's Black population is associated with 9.2 additional COVID cases per 10,000 people, and a similar increase in Hispanic residents is associated with 20.6 additional cases. Potential explanations for this increase are disparities in types of jobs (frontline employment vs. employment that can move online), ability (or lack thereof) to move outside of the city during COVID peaks, access to healthcare and testing during the pandemic, and long-term preexisting health disparities (Benitez, Courtemanche, & Yelowitz 2020).

It is also important to examine literature about the history of segregation in the United States and Chicago specifically. Chicago has been used as a model for prototypical American cities (Park and Burgess 1925) and is quite segregated between its predominantly-white suburbs and its predominantly-Black "South Side". In recent decades, however, Chicago has become more diverse overall and has undergone complex racial change including an increase in Hispanic/Latino identifying residents (Onésimo Sandoval 2011).

One resource about segregation in the US that I find particularly interesting and important is the web map "Mapping Inequality" that was developed by scholars at the University of Richmond, Virginia Tech, and the University of Maryland. The interactive mapping tool can be found here: https://dsl.richmond.edu/panorama/redlining/#loc=5/39.1/-94.58. Below is a screenshot of Chicago's redlining maps as found on the Mapping Inequality resource. It shows that most neighborhoods in southern and central Chicago were deemed "Definitely Declining" or "Hazardous" by redlining maps in the 1930s, whereas suburbs to the north and outskirts of the city were deemed "Best" or "Still Desirable". These designations were highly racially charged and illustrate the legacy of segregation on racial breakdowns in Chicago today.

[Figure: screenshot of Chicago's redlining map from Mapping Inequality]

Data and Data Processing

Import Packages

Data Wrangling

I loaded in Chicago's COVID data by zip code and merged it with geospatial data about each zip code.

I then chose which columns I was interested in and created a new dataframe with those columns. I then isolated data from the first week of August (the most recent week when I started working on this project). I created a column for cumulative cases per 10,000 people and made the dataframe into a geodataframe.
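The filtering and rate calculation described above can be sketched as follows. The column names (`zip_code`, `week_start`, `cases_cumulative`, `population`), the date, and the values are hypothetical stand-ins, not the actual fields or figures from the Open Data portal.

```python
import pandas as pd

# Hypothetical weekly COVID records by zip code.
covid = pd.DataFrame({
    "zip_code": ["60601", "60601", "60602"],
    "week_start": ["2021-07-25", "2021-08-01", "2021-08-01"],
    "cases_cumulative": [900, 950, 120],
    "population": [14675, 14675, 1244],
})
covid["week_start"] = pd.to_datetime(covid["week_start"])

# Keep only the most recent week of interest (assumed date).
latest = covid[covid["week_start"] == pd.Timestamp("2021-08-01")].copy()

# Cumulative cases per 10,000 residents.
latest["cases_per_10k"] = (
    latest["cases_cumulative"] / latest["population"] * 10_000
)
print(latest[["zip_code", "cases_per_10k"]])
```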

I loaded sidewalk data from Cook County (Chicago's county) into a geodataframe as well.

I set all geodataframes to EPSG 3435 (NAD83 / Illinois East (ftUS)).
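A minimal sketch of this reprojection step with geopandas, assuming a layer that starts in latitude/longitude (EPSG:4326). The single point is a hypothetical site, not one of the real testing locations.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical testing site in latitude/longitude (EPSG:4326).
sites = gpd.GeoDataFrame(
    {"name": ["Hypothetical site"]},
    geometry=[Point(-87.6298, 41.8781)],
    crs="EPSG:4326",
)

# Reproject to NAD83 / Illinois East (ftUS), so distances are in US survey feet.
sites_ft = sites.to_crs(epsg=3435)
print(sites_ft.crs)
```

Putting every layer in the same projected CRS matters later because buffers and clips are computed in the units of the CRS.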

I clipped my sidewalk data by the COVID cases (Chicago boundaries).
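Clipping one layer by another can be sketched with `geopandas.clip`. The square boundary and the two line segments below are toy geometries chosen for illustration only.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Toy city boundary: a 10 x 10 square.
boundary = gpd.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="EPSG:3435")

# Toy sidewalk segments: one crosses the boundary, one lies fully outside it.
sidewalks = gpd.GeoDataFrame(
    {"id": [1, 2]},
    geometry=[
        LineString([(5, 5), (15, 5)]),
        LineString([(20, 20), (30, 30)]),
    ],
    crs="EPSG:3435",
)

# Keep only the parts of each segment that fall inside the boundary.
clipped = gpd.clip(sidewalks, boundary)
print(clipped)
```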

Then, I loaded Census tract geospatial data into a geodataframe.

I was unable to load the Census demographic data directly via a link. Instead, I found a table here: https://data.census.gov/cedsci/advanced?q=chicago. I downloaded tract-level race data for all of Chicago to my local computer.

I created columns for percent of the population that is white and percent of the population that is of color. I made a new dataframe called "simpleracedf" with these columns, then merged that dataframe back into my Census geodataframe.
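The percentage columns and the merge can be sketched as below. The input column names (`tract`, `total_pop`, `white_pop`) and values are hypothetical, and a plain DataFrame stands in for the Census tract layer; the same `merge` call is available on a GeoDataFrame.

```python
import pandas as pd

# Hypothetical tract-level race counts.
race = pd.DataFrame({
    "tract": ["000100", "000200"],
    "total_pop": [4000, 2500],
    "white_pop": [3000, 500],
})

race["pct_white"] = race["white_pop"] / race["total_pop"] * 100
race["pct_of_color"] = 100 - race["pct_white"]

# Keep only the columns needed for the analysis.
simpleracedf = race[["tract", "pct_white", "pct_of_color"]]

# Stand-in for the Census tract layer, joined on the tract identifier.
tracts = pd.DataFrame({"tract": ["000100", "000200"]})
tracts = tracts.merge(simpleracedf, on="tract", how="left")
print(tracts)
```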

Below is my final Census tract geodataframe. It includes columns for percent of the population that is white and percent of the population that is of color.

Below is my final COVID cases geodataframe. It includes cumulative cases per 10,000 people for each zip code.

Below is my final testing sites geodataframe.

Data Analysis

I pulled one testing site out of my testing sites geodataframe to attempt my network analysis. I ultimately was unable to perform network analysis, but I kept the small sidewalk visualizations I made to give a glimpse into my process and to show the sidewalk data that I loaded into the notebook.

I tested a buffer on the sample site once I realized I would be unable to complete network analysis.

I decided to go with a half mile buffer for each testing site to get a sense of areas that have readily accessible testing sites.
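A sketch of the half-mile buffer, assuming the sites are already in EPSG 3435, where coordinates are in US survey feet and half a mile is 2,640 feet. The site coordinates are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import Point

HALF_MILE_FT = 2640  # 5,280 feet per mile / 2

# Hypothetical testing sites, already projected to EPSG:3435 (feet).
sites = gpd.GeoDataFrame(
    {"name": ["Site A", "Site B"]},
    geometry=[Point(1_170_000, 1_900_000), Point(1_180_000, 1_910_000)],
    crs="EPSG:3435",
)

# Replace each point with a circular polygon of radius half a mile.
buffers = sites.copy()
buffers["geometry"] = sites.geometry.buffer(HALF_MILE_FT)
print(buffers.geometry.area)
```

Buffering in a geographic CRS such as EPSG:4326 would treat the distance as degrees, so the projected CRS is what makes the 2,640 value meaningful.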

I then created 'near' and 'not near' layers based on the buffers and merged them together. That way, I have one layer with polygons that are near and not near testing sites. This layer will be used for ANOVA analysis.
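One way to arrive at a single layer labeled near or not near is sketched below: dissolve the buffers into one shape and flag each polygon that intersects it. This is an illustrative alternative under assumed toy geometries, and the notebook's own near and not near layers may have been built differently.

```python
import geopandas as gpd
from shapely.geometry import Point, box
from shapely.ops import unary_union

# Toy tracts: one overlaps the buffer, one is far away.
tracts = gpd.GeoDataFrame(
    {"tract": ["A", "B"], "pct_white": [60.0, 40.0]},
    geometry=[box(0, 0, 5000, 5000), box(20000, 0, 25000, 5000)],
    crs="EPSG:3435",
)

# Toy half-mile buffer around a single hypothetical site.
site_buffers = gpd.GeoSeries([Point(2500, 2500).buffer(2640)], crs="EPSG:3435")

# Merge all buffers into one shape, then flag tracts that touch it.
service_area = unary_union(list(site_buffers))
tracts["near_site"] = tracts.geometry.intersects(service_area)
print(tracts[["tract", "near_site"]])
```

The resulting boolean column can then serve as the grouping variable for the ANOVA.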

Visualization

Analysis and Results

My results indicate that the areas near Chicago's COVID-19 testing sites (within a half mile) are on average about 4% whiter than areas not near testing sites. This difference is small but statistically significant, with a p-value of 0.024.

I did not find any statistically significant difference between the COVID case rate near testing sites and not near testing sites. There was an average of just 3 more cases per 10,000 people in the "not near" group, and the p-value of 0.922 provides no evidence of a difference between the two groups.

Conclusions

A key result of this project is the statistically significant difference in racial breakdowns between areas near and not near COVID testing sites. I found that areas with accessible testing sites are about 4% whiter than areas without accessible testing. This finding is an example of health resources being allocated unevenly along racial lines, which may have contributed to the disparate impacts of the pandemic.

Though I did not find a statistically significant difference between COVID case rates near and not near testing sites, this finding is not too surprising. It illustrates that, perhaps, Chicago did not purposely add testing sites to areas that had more COVID cases. That may be an area for future work for health officials in cities.

One large limitation of this project was the method of calculating what is "near" a testing site. I wanted to create service areas using network analysis, but the timescale of the project and available open source packages did not allow for this method. Instead, I used half-mile buffers. These buffers can still give us a sense of what is within walking distance, especially since Chicago's sidewalks are close to a uniform grid, but they do not fully capture what it means for a location to be accessible. People may access COVID-19 testing by driving, biking, using public transportation, getting a ride from a friend, or (now that over-the-counter rapid tests are available) by ordering pharmacy delivery. I acknowledge this limitation of my analysis but still find it useful to investigate the physical locations of testing sites and the demographics of people who live near them.

Another limitation is the way that I measured COVID distribution across Chicago. The most granular data available was at the zip code level, and it would have been helpful to have data at the Census tract level, especially to stay consistent with my demographic data. Additionally, I chose to use the measure of cumulative cases for each zip code and create a measure of cases-to-date per 10,000. This measure does not reflect fluctuations throughout the pandemic or movements of testing sites throughout the city. It is, however, a snapshot of the cumulative impact of COVID throughout the city.

References

Benitez, J., Courtemanche, C. & Yelowitz, A. (2020). Racial and Ethnic Disparities in COVID-19: Evidence from Six Large Cities. J Econ Race Policy 3, 243–261. https://doi.org/10.1007/s41996-020-00068-9.

Nelson, R., Madron, J., Ayers, N., Winling, L., Marciano, R., Jansen, G., Lee, M., Shah, S., Kending, M. & Connolly, N. D. B. Mapping Inequality: Redlining in New Deal America. University of Richmond’s Digital Scholarship Lab. https://dsl.richmond.edu/panorama/redlining/#loc=5/39.1/-94.58.

Onésimo Sandoval, J. S. (2011). Neighborhood Diversity and Segregation in the Chicago Metropolitan Region, 1980-2000. Urban Geography, 32(5), 609-640. https://doi.org/10.2747/0272-3638.32.5.609.